ABSTRACT

Malaria, caused by Plasmodium species, is a lethal human disease with high morbidity and mortality. The main obstacle to
conquering this disease is the parasite's rapidly evolving genetic structure. The recently published whole genome of Plasmodium vivax
provides an opportunity to compare it in depth with the previously published genome of Plasmodium falciparum. Here we compared
every chromosome of both species and found, surprisingly, unknown bases (N) in the sequences. Analysis of tandem
repeats showed that their number is consistently higher in P. falciparum and correlates strongly with chromosome
length. All vir and var genes were also analyzed, and very little sequence homology was found between them. To explore the
functional role of the vir and var gene products, their amino acid composition was analyzed; hydrophilic amino acids
were more frequent than hydrophobic ones. Phylogenetic analysis of mitochondrial cytochrome b and the nuclear-encoded
high mobility group box (HMGB) protein was performed. The HMGB phylogeny suggests that this
transcription factor can serve as a reference point for developing separate drugs for P. falciparum and P. vivax, a
strategy that may prove more reliable for the eradication of malaria.


Key words: Allelic diversity, phylogeny, tandem repeats, vertical transfer, invasiveness. 


INTRODUCTION 

Malaria is one of the most common infectious diseases
and an enormous health problem. Each year, up to
three million deaths due to malaria and close to five
billion episodes of clinical illness possibly meriting
anti-malarial therapy occur throughout the world, with
Africa bearing more than 90% of this burden [1, 2].
Almost 3% of disability-adjusted life years lost globally
are due to malaria mortality. Malaria occurs across
large parts of the Americas, Asia, and Africa, and
ninety percent of malaria-related deaths occur in
Sub-Saharan Africa. Malaria is commonly associated with
poverty, but it is also a cause of poverty and a major
hindrance to economic development [1, 3].


Malaria is caused by several species of
Plasmodium, of which Plasmodium falciparum
is the deadliest [4]. This most virulent species
invades both reticulocytes and mature
erythrocytes, whereas Plasmodium vivax is less
virulent and is restricted to
reticulocytes [5]. Research aimed at eradicating the
disease has been under way for a long time, but it
has met with only limited success. One of the main
reasons is the rapidly evolving genetic structure
of the Plasmodium parasite itself. Many drugs have been
developed to combat the disease, but most of them
fail in the early phases of clinical trials, mainly
because the parasite develops resistance to
anti-malarial drugs. To date, no effective vaccine is
available for malaria [6, 7, 8], and the drugs
on the market usually become less effective because
of the high rate of antigenic variation [9].


Recently, the whole genome sequence of
Plasmodium vivax became available [10]. Combining
this information with the already available
whole genome sequence of the deadliest
species, Plasmodium falciparum [11], has opened up
exciting avenues for analyzing their pathogenic
roles at the genomic level. High-throughput
sequencing methods have started a new era of in silico
analysis that allows both genomes to be studied
comprehensively. In this study we examined all
14 chromosomes of Plasmodium falciparum and
Plasmodium vivax and compared the two gene families,
var and vir, that are mainly responsible for
the high rate of antigenic variation. The main difference
between the genomes of the two species lies in their
adenine plus thymine (A+T) content: approximately 55%
in P. vivax and approximately 80% in P. falciparum,
which makes the latter the second most AT-rich
genome known [4, 12].


Among the virulence factors responsible for
antigenic variation, Erythrocyte Membrane
Protein 1 is the most prominent. This variant protein is
encoded by the var genes [3, 13,
14]. The var genes are mostly found in the telomeric
and sub-telomeric regions; the exceptions are chromosomes
4, 7, 8 and 12, in which they lie in the
central region [www.vardb.org, 15, 16].


Phylogenetic analysis of Plasmodium species has been well
studied as a means of resolving the taxonomic status
of this apicomplexan parasite. Such analyses usually rely on
highly conserved mitochondrial genomes or proteins,
because these are transferred vertically by
maternal inheritance. In contrast,
we propose a new strategy of phylogenetic analysis
based on transcription factors with a medium level of
sequence conservation, with the aim of designing a more
potent vaccine against malaria.


MATERIALS AND METHODS 

The DNA sequences of all 14 chromosomes of
P. falciparum and P. vivax were collected from
GenBank at NCBI (www.ncbi.nlm.nih.gov). The var
genes, which are responsible for antigenic variation in
P. falciparum, were taken from PlasmoDB
(www.plasmodb.org/plasmo/), a database
that contains proteomic and genomic
information on P. falciparum. The percent
GC and AT content of every chromosome of both
species was calculated with a program written in
MATLAB version 7.4 (www.mathworks.com).
We selected all the var genes scattered
throughout the genome of P. falciparum. For the
P. vivax genome, all the vir genes were taken from the
NCBI database (www.ncbi.nlm.nih.gov). Sequence
similarity among these genes was checked with
Clustal W [17].
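
The composition calculation can be sketched in a few lines. The MATLAB 7.4 program used in the study is not reproduced in the paper, so the Python version below is only an illustrative stand-in; the function name `base_composition` and the toy sequence are hypothetical. It reports A+T both as a share of all positions and as a share of unambiguous (A, C, G, T) positions, because undetermined bases (N) change the denominator.

```python
from collections import Counter


def base_composition(seq):
    """Return percent A+T and G+C for a DNA sequence, plus the count of N bases.

    Hypothetical helper for illustration; not the authors' MATLAB program.
    """
    counts = Counter(seq.upper())
    at = counts["A"] + counts["T"]
    gc = counts["G"] + counts["C"]
    total = len(seq)
    known = at + gc  # positions carrying an unambiguous base
    return {
        "n_count": counts["N"],
        "at_pct_all": 100.0 * at / total if total else 0.0,
        "at_pct_known": 100.0 * at / known if known else 0.0,
        "gc_pct_known": 100.0 * gc / known if known else 0.0,
    }


# Toy sequence (made up): 6 A/T, 2 G/C and 2 N.
result = base_composition("ATATTAGCNN")
print(result)
```

Whether N positions are included in the denominator is a choice that has to be stated when percentages from different sources are compared.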


To detect tandem repeats in the whole
genomes of P. vivax and P. falciparum, we used
Tandem Repeats Finder [18]. This software was used to
compare the frequency of occurrence of tandem
repeats in every chromosome of both
species on the basis of their sequences, lengths, and
copy numbers. Correlation analysis and curve fitting of
the tandem repeat counts were then performed [19, 20].
The percent composition of each of the four nucleotides
was also calculated with the same software.
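
The correlation and best-fit step can be illustrated with the P. falciparum chromosome sizes and tandem repeat counts listed in Table 1. The paper cites [19, 20] for this analysis without giving details, so the sketch below assumes a Pearson correlation coefficient and an ordinary least-squares straight line; `pearson_and_fit` is a hypothetical helper name.

```python
import math

# Chromosome sizes (bp) and tandem repeat counts for P. falciparum,
# chromosomes 1-14, as listed in Table 1.
lengths = [643292, 947102, 1060087, 1204112, 1343552, 1418244, 1351552,
           1325595, 1541723, 1694445, 2035250, 2271916, 2732359, 3291006]
repeats = [2221, 3106, 3674, 4000, 4632, 4635, 4458,
           4690, 5353, 5572, 6886, 7550, 8535, 10640]


def pearson_and_fit(x, y):
    """Return (r, slope, intercept) for the least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    r = sxy / math.sqrt(sxx * syy)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return r, slope, intercept


r, slope, intercept = pearson_and_fit(lengths, repeats)
print(f"r = {r:.3f}, slope = {slope * 1000:.2f} repeats per kb, intercept = {intercept:.0f}")
```

The slope is the quantity to compare between species: it estimates how many additional tandem repeats accompany each additional base pair of chromosome length.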


The protein sequences encoded by the var and vir
genes were obtained from NCBI
(www.ncbi.nlm.nih.gov/nucleotide). The amino acid
composition of these protein sequences was determined
using MEGA 4.0 [21, 22].
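
Percent amino acid composition is a simple residue count divided by sequence length. The study obtained these values from MEGA 4.0; the sketch below is an independent illustration, not that program's code, and the function name `amino_acid_composition` and the toy peptide are hypothetical.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def amino_acid_composition(protein):
    """Percent composition of the 20 standard amino acids in a protein sequence."""
    counts = Counter(protein.upper())
    total = sum(counts[aa] for aa in AMINO_ACIDS)  # ignore non-standard symbols
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: 100.0 * counts[aa] / total for aa in AMINO_ACIDS}


# Toy peptide (made up for illustration): 10 residues.
composition = amino_acid_composition("MKKEENSKLE")
print({aa: pct for aa, pct in composition.items() if pct > 0})
```

Averaging such per-protein percentages over all proteins of a family gives a summary row of the kind labelled "Avg" in Tables 2 and 3.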


We downloaded the highly conserved mitochondrial
protein cytochrome b for different Plasmodium species.
We then selected the high mobility group box protein
(HMGB), one of the most conserved
transcription factors in Plasmodium species [23]. Both
the cytochrome b and HMGB sequences were retrieved from
the NCBI database
(www.ncbi.nlm.nih.gov/protein). To find
similar sequences in different Plasmodium species, we
used the PSI-BLAST program [24]. Complete
sequences were aligned in Clustal W using the
default parameters. Phylogenetic trees were constructed
by the maximum parsimony method in
MEGA 4.0 [21, 22], with a total of 100000 replicates
carried out for the bootstrap analysis.
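
In a phylogenetic bootstrap, each replicate is built by resampling the columns of the alignment with replacement; a tree is then inferred from every replicate, and the support for a clade is the fraction of replicate trees that contain it. MEGA performs this internally. The sketch below shows only the resampling step, on a made-up alignment with placeholder taxon names; tree inference is omitted.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement (one bootstrap replicate).

    `alignment` maps a taxon name to its aligned sequence; all sequences
    must have the same length.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("aligned sequences must have equal length")
    n_cols = lengths.pop()
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[i] for i in cols) for name, seq in alignment.items()}


# Toy alignment (made-up sequences; taxon labels are placeholders).
toy = {
    "taxon_A": "MKVLA-TG",
    "taxon_B": "MKILASTG",
    "taxon_C": "MRVLA-SG",
}
rng = random.Random(0)  # fixed seed so the example is repeatable
replicates = [bootstrap_alignment(toy, rng) for _ in range(100)]
print(replicates[0])
```

Because whole columns are drawn, every resampled site keeps the same residue pattern across taxa that it had in the original alignment.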


Figure 1. (a) Number of tandem repeats vs. chromosome length and the best fit for P. falciparum.
Figure 1. (b) Number of tandem repeats vs. chromosome length and the best fit for P. vivax.


RESULTS 

A chromosome-by-chromosome comparison of the whole
genome sequences of P. falciparum and
P. vivax (Table 1) reveals several interesting features of
the two genomes. Although both species
have the same number of chromosomes, their genome sizes
differ.


The genome of P. vivax is larger overall, but a
chromosome-by-chromosome analysis shows that
chromosomes 2, 3, 4, 6, 10, 13
and 14 are longer in P. falciparum. The A+T
content computed by our program for both species
differs from the values quoted at NCBI
(www.ncbi.nlm.nih.gov/protein). This is mainly due
to the presence of unknown bases, denoted N,
in addition to the A, T, C, and G bases.


Comparison of the vir genes of P. vivax with the var
genes of P. falciparum revealed very little sequence
similarity. The mean length of the var genes is also
greater than that of the vir genes.


Tandem repeat detection in each
chromosome showed that the number of tandem
repeats is consistently higher in P. falciparum
[Table 1]. The number of tandem repeats shows a strong
correlation with chromosome length in both
species [Figure 1]. The total
number of repeats was 75,952 in P. falciparum and
28,817 in P. vivax. Applying the same strategy to the vir
and var genes, we found that the
P. falciparum var genes are much longer
and contain 724 tandem repeats, whereas
the P. vivax vir genes contain 14
tandem repeats.


All the protein sequences encoded by the vir and var genes
were analyzed. Very little similarity was found among
the var and vir gene products. Because
amino acid composition is very important in
determining the functional role of these proteins [25],
the overall amino acid composition was calculated with
MEGA 4.0 [21, 22]. Most of
the var and vir gene products contain glutamate,
lysine, asparagine and serine in large proportions [Tables
2, 3].


The normalized average frequency of each hydrophilic
amino acid is higher in the Vir and Var proteins than
would be expected under an unbiased statistical
distribution of amino acids.
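
A small worked check of this comparison can be made from the "Avg" row of Table 3 (var-encoded proteins). Two assumptions are mine rather than the paper's: the set of residues counted as hydrophilic (charged and polar uncharged side chains), and the reading of "unbiased statistical distribution" as a uniform 5% for each of the 20 amino acids.

```python
# Average percent composition of var-encoded proteins (the "Avg" row of Table 3).
avg_var = {
    "Ala": 4.05, "Cys": 3.24, "Asp": 8.40, "Glu": 8.79, "Phe": 3.00,
    "Gly": 6.33, "His": 1.91, "Ile": 5.10, "Lys": 11.34, "Leu": 5.60,
    "Met": 1.28, "Asn": 7.51, "Pro": 4.55, "Gln": 3.75, "Arg": 3.46,
    "Ser": 6.32, "Thr": 6.12, "Val": 3.75, "Trp": 1.52, "Tyr": 3.99,
}

# Assumed classification (not taken from the paper): charged and polar
# uncharged residues are treated as hydrophilic.
HYDROPHILIC = ["Asp", "Glu", "Lys", "Arg", "His", "Asn", "Gln", "Ser", "Thr"]

observed = sum(avg_var[aa] for aa in HYDROPHILIC)
# Under a uniform distribution each of the 20 residues would contribute 5%.
expected = 100.0 * len(HYDROPHILIC) / len(avg_var)

print(f"hydrophilic share: observed {observed:.2f}%, uniform expectation {expected:.2f}%")
```

Under these assumptions the comparison is made on the aggregate hydrophilic share; a different residue classification would shift both numbers.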


To study the taxonomic status of P. falciparum and
P. vivax, a phylogenetic tree was reconstructed from the
more conserved mitochondrial cytochrome b
protein (Figure 3a). This tree shows the
taxonomic positions of the two species relative to each
other. Among the various transcription factors, we
used the high mobility group box protein (HMGB) for
phylogenetic analysis because of its high
level of conservation. The phylogenetic tree
reconstructed from HMGB shows a distant
relationship between P. falciparum and P. vivax (Figure 3b).


International Journal of BioSciences and Technology (2010), Volume 3, Issue 4, Page(s): 46 - 54 


ISSN: 0974 - 3987 


Figure 3. (a) Phylogenetic tree derived from an alignment of selected cytochrome b mitochondrial proteins. [Tree image not reproduced; recoverable taxon labels: P. chabaudi chabaudi, P. knowlesi, Babesia bovis T2Bo, Theileria parva strain Muguga.]


Figure 3. (b) Phylogenetic tree derived from an alignment of selected nuclear-encoded high mobility group box (HMGB) proteins. [Tree image not reproduced; taxon labels: P. yoelii yoelii str., P. berghei strain ANKA, P. falciparum 3D7, P. chabaudi chabaudi, P. vivax Sal-1, P. knowlesi, Theileria parva strain Muguga, Babesia bovis T2Bo.]


Table 1. Comparative analysis of various genomic features of P. vivax and P. falciparum


Chromosome   Character                  P. vivax   P. falciparum
1            Size (bp)                   830,022         643,292
             % AT                          52.88           79.45
             No. of genes                    176             157
             No. of tandem repeats         1,129           2,221
2            Size (bp)                   755,035         947,102
             % AT                          55.13           80.25
             No. of genes                    162             223
             No. of tandem repeats         1,043           3,106
3            Size (bp)                 1,011,127       1,060,087
             % AT                          55.18           80.12
             No. of genes                    220             247
             No. of tandem repeats     [missing]           3,674
4            Size (bp)                   876,622       1,204,112
             % AT                          54.86           79.32
             No. of genes                    208             254
             No. of tandem repeats     [missing]           4,000
5            Size (bp)                 1,370,936       1,343,552
             % AT                          55.72           80.67
             No. of genes                    316             330
             No. of tandem repeats         1,806           4,632
6            Size (bp)                 1,033,388       1,418,244
             % AT                          54.08           80.21
             No. of genes                    248             322
             No. of tandem repeats         1,468           4,635
7            Size (bp)                 1,497,819       1,351,552
             % AT                          54.37           80.02
             No. of genes                    353             294
             No. of tandem repeats         2,047           4,458
8            Size (bp)                 1,678,596       1,325,595
             % AT                          54.60           80.29
             No. of genes                    378             299
             No. of tandem repeats     [missing]           4,690
9            Size (bp)                 1,923,364       1,541,723
             % AT                          53.91           80.98
             No. of genes                    434             367
             No. of tandem repeats     [missing]           5,353
10           Size (bp)                 1,419,739       1,694,445
             % AT                          55.05           80.31
             No. of genes                    327             404
             No. of tandem repeats         1,897           5,572
11           Size (bp)                 2,067,354       2,035,250
             % AT                          54.92           81.04
             No. of genes                    468             490
             No. of tandem repeats         2,429           6,886
12           Size (bp)                 3,004,884       2,271,916
             % AT                          55.37           80.69
             No. of genes                    695             530
             No. of tandem repeats     [missing]           7,550
13           Size (bp)                 2,031,768       2,732,359
             % AT                          54.35           80.85
             No. of genes                    446             667
             No. of tandem repeats     [missing]           8,535
14           Size (bp)                 3,120,417       3,291,006
             % AT                          56.99           81.56
             No. of genes                    699             771
             No. of tandem repeats         3,558          10,640


Table 2. Amino acid composition of vir gene encoded proteins in Plasmodium vivax


Accession No.

XP_001608611.1
XP_001608613.
XP_001613646.
XP_001612659.
XP_001612596.
XP_001612578.
XP_001612579.
XP_001612835.
XP_001612837.
XP_001612937.
XP_001612938.
XP_001612941.1
XP_001608558.1
XP_001614849.1
XP_001614850.1
XP_001614853.1
XP_001614854.1
XP_001615150.1
XP_001615151.1
XP_001612686.
XP_001612688.
XP_001608384.1
XP_001608312.
XP_001608316.
XP_001608317.
XP_001608318.
XP_001608320.1
XP_001608323.1
XP_001614319.1
XP_001614320.1
XP_001614321.1
XP_001613040.1
XP_001613165.1
XP_001613166.1
XP_001616487.
XP_001612629.
XP_001612630.
XP_001612631.1
XP_001612632.
XP_00161434.1
XP_001614059.
Avg

[The per-protein amino acid percentages and the Avg values of this table are not recoverable from the source.]

a) Number of hydrophilic amino acids is high.
b) All the data given here show the percent composition.


Table 3. Amino acid composition of var gene encoded proteins in Plasmodium falciparum


Different types of amino acid composition (%)

Accession No. Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr


XP 001350936. 4.72, 3.33 8.55 8.18 2.64 6.2 1.66 4.76 1.05 5.13 29° 7.17 546 4.39 3.19 5.78 6.84 3.88 1.62 4.16 
XP 001351079. 4.63 3.13 9.26 9.12 3.09 645 1.82 4.81 L67| 3.27 04 6.9 4.22 3.77 3.5| 5.72] 6.81] 3.86 1.5 3.45 
XP 001351080. 3.94 3.08 8.28 951 2.94 6.75 1.54 4.75 118 5.61 086 7.42 5.34 3.62 3.67 6.29 5.98 4.39 1.36 3.49 
XP 001351319. 4.46 348 8.62 8.47 2.99 7.41 1.89 4.57 11.6 5.18 13° 6.73, 4.61 3.63 3.1 6.54 6.84 3.7 1.44 3.63 
XP 001351321. 4.04 3.35 8.16 9.09 2.86 6 167 4.85 1.36 5.62 Al 8.13 5.13 3.29 401 586 56 3.61 1.76 4.21 
XP 001351435. 3.94 3.3, 8.69 8.47] 2.85) 7.11) 1.99) 5.25 1.77 4.93 18 6.88 4.48 3.89 3.26 7.06 5.79 3.62 1.49 4.03 
XP 001351437. 4.66 3.03 8.48 8.3. 2.59 7.77 1.84 4.44 10.5 5.93 AQ 7.42 4) 3.95) 3.51) 6.54) 6.59) 3.91} 1.19| 3.86 
XP 001351438. 4.07 2.94 9.26 7.57 2.9) 7.27) 2.12 4.84 0.16 5.97 38 7.27 5.23 4.41 3.42 6.36 5.75 3.68 1.38 4.02 
XP 001351439. 4) 2.92) 9.2 7.82, 2.92 7.18) 2.19 4.9 0.19 5.85 25 7.22) 5.25 447 3.35 6.28 5.72 3.83 1.38 4.08 
XP 001351513. 3.07| 3.02 8.5 9.85| 3.39) 6.32) 2.09) 5.39 171) 3.67 21 7.62 4.18 3.62) 3.53) 6.46) 5.34) 3.62 1.49 3.9 
XP 001351514. 3.41 3.04 8.5 9.5 3.04 6.77 2 4.45 213) 3.86 09 7.63 445 3.23 3.54 654 5.72 4.18 1.41 3.5 
XP 001351515. 3.26 3.03 8.07 9.54 3.21 656 2.29 4.54 2.01 6.05 0.92 7.98 4.45 3.03 3.16 6.37 6.69 3.9) 1.38] 3.58 
XP 001351517. 4.1 3.28 8.8 8.89 3.1 7.07 1.73 4.65 1.49 3.2 23 7.39 4.1 3.83 3.15 6.16 6.61 3.92 1.5 3.78 
XP 001351561. 4.78 346 7.85 8.67 2.65 6.42 2| 5.12 157 5.4 27° 7.57 4.36 3.41, 3.77, 6.14) 6.19) 3.57) 1.77 4.02 
XP 001351564. 3.72, 3.13. 7.52 10.33. 2.63 6.21 1.54 4.98 1.01 5.48 04 6.62 5.21 4.08 3.44 662 7.39 3.67 14 3.99 
XP 000965997. 441 3.61 7.95 7.92 3.2 4.52 1.88 5.59 1.22| 5.97 .77| 8.06] 3.89] 3.72] 3.86] 5.9) 6.22) 3.86 1.7 4.76 
XP 000965999. 3.4 2.49 8.08 717 2.79 4.83 2.04 6.34 974 611 2.72 114 3.62 2.72 4.08 649 4.91 4.83 1.58 4.68 
XP 000966160. 4.43, 3.59 7.98 8.86 3.43 5.81 2.34 5.72 2.32 6.06 .29| 7.23) 4.39 3.8 3.26 543 6.02 2.97 1.71 3.38 
XP 000966305. 3.89] 3.62 8.3 8.22] 3.26) 5.21) 2.05) 5.54 1,73] 3.87 39 7.76 4.05 3.74 3.31 6.5] 5.77] 3.26) 1.77) 4.75 
XP 000966307. 4.11 3.04 8.4 8.71 2.99 697 1.92 5.45 1.26 5.72 0.94 6.79 4.47 4.2 3.84 5.81 643 3.84 1.34 3.75 
XP 001348946. 4.78 346 7.85 8.67 2.65 6.42 2| 5.12 1.57 5.4 27° 7.57 4.36 3.41 3.77 6.14 6.19 3.57 1.77 4.02 
XP 001349032. 3.62] 3.09) 8.69 8.65 3. 7.01 1.85 5.43 0.54 5.69 24 6.93 4.63 3.79 3.48 7.72 5.82. 3.53 141 3.88 
XP 001349035. 3.44. 2.95 8.31 8.71 2.99 6.16 2.64 5.94 0.67 5.58 25 8.22 487 4.11 3.44 6.25 6.16 2.99 1.43 3.89 
XP 001349036. 4.11 3.28 8.39 9.49 2.6 7.12 1.87 4.88 1,33) 3.11 Al 7.03 4.24 3.79 3.6 661 5.89 4.01 1.28 4.15 
XP 001349219. 3.58| 3.39] 7.86 8.67| 3.08) 6.55) 1.81) 5.32 1.95 5.82 0.89 74 4.01 3.16 3.08 6.78 7.44 4.01 1.31 3.89 
XP 001349030. 3.61 2.98 8.04 8.67 3.02 6.95 14 4.51 1.24 5.55 14 853 4.97 4.24 2.71 6.05 7) 3.93 14 3.79 
XP 001349031. 4.04 2.86 8.62 9.36 3.38 6.15 1.98 5.05 1.52, 5.41 .27| 7.43) 4.13) 4.18 3.6 646 5.71 3.43 1.32 4.09 
XP 001349033. 4.75 3.44 7.95 8.24 3.34 5.38 2.62 5.52 0.27| 3.357 55° 7.95 4.07 446 3.68 5.81 5.33 3.63 1.94 4.51 
XP 001349034. 4.42 3.21 8.49 9.16 3.21 6.77 1.9 5.15 1.38 5.73 44. 745 4.24 3.57 3.16 5.73) 6.14 3.88 1.35 3.61 
XP 001349434. 4.13, 3.22) 8.77 8.44 3.37 5.78 1.85 Ae) 0:67 | 3.55 09 7.63 5.07 3.94 3.46 65 6.21 3.27 1.52 4.03 
XP 001349437. 3.77 3.14 8.96 10.49 2.91 6.32 1.93 4.57 2.33 4.98 48 7.31 3.99 3.72 3.45 5.65 6.1 3.81 1.34 3.77 
XP 001349438. 4.02 2.91 8.79 8.12] 2.83) 6.49 2.3| 3.08 0.95 5.65 AS) 7.37) 3.42) 3.71) 3.13) 7.15) 6.36) 3.71) 1.37 3.8 
XP 001349512. 4.43, 3.59 7.75 8.83 3.02 5.64 1.81 5.17 1,31) 3.57 jl 7.82) 5.03 3.62 3.05 648 5.77 3.93 1.64 4.43 
XP 001349513. 4.09 3.88 7.98 7.8 3.5| 4.76) 1.71) 5.53 2.39 6.16 AT 889 444 3.32 3.64 5.63 5.32 3.57 1.82 4.09 
XP 001349514. 4.1 3.21 8.96 8.77, 2.97, 6.04) 1.75. 5.14 10.8 5.71 0.99 6.6 5.38 4.06 3.35 6.37 6.27 4.01 1.46 4.06 
XP 001351877. 4.01 2.91 8.77 8.59 2.86 6.65 1.72 3.2 1.37| 3.37 32 656 4.71 4.27 3.66 6.96 5.81 3.83 1.45 3.96 
XP 001352242. 3.71) 2.92, 8.97 9.14 2.74 654 1.77 5.26 1.66 5.65 37| 6.93 4.2 345 3.62 636 663 3.53 141 4.15 
XP 002585487. 4.62 2.93 8.73 8.64 2.97 6.86 1.78 4.8 0.92| 5.39 .28| 7.54) 4.25) 3.84) 3.52) 7.27) 5.99) 3.79} 1.37] 3.52 
XP 001349738. 3.7 3.1 7.96 9.9| 2.68) 5.92) 1.67 5 1.05] 5.18 34° 7.12 5 3.75 3.33 6.01 7.03 4.39 148 4.39 
XP 001350409. 4.05 3.24 9.18 8.68 3.1 6.79 1.66 4.72 1.74 5.67 0.94 7.29 4.41 3.46 3.1 666 5.8 4.05 1.44 4 
XP 001349740. 3.26 3.68 8.19 9.41] 3.23) 5.71) 1.79) 5.29 1.78 5.68 49° 7.53 4.66 3.56 3.77 6.28 5.26 3.89 1.76 3.8 
Avg 4.05 3.24 8.4 8.79 3 6.33 1.91 5.1 11.34 5.6 1.28 7.51 4.55 3.75 3.46 6.32 6.12 3.75 1.52 3.99 


a) Number of hydrophilic amino acids is high.
b) All the data given here show the percent composition.


DISCUSSION

This study indicates that Plasmodium vivax and
Plasmodium falciparum are highly
diverse at the genomic level, and that this difference is
reflected in their different functional roles as malaria
parasites. The number of chromosomes is the same in both
species, and P. vivax has a comparatively larger genome
[10]. There are, however, remarkable differences in
individual chromosome lengths:
some chromosomes of P. vivax are longer than the
corresponding chromosomes of P. falciparum, and vice
versa. The major difference between the genomes is
the percent A+T content, which is ~80% in
P. falciparum. Surprisingly, the distribution of
A+T nucleotides also differs significantly: in
P. vivax, high A+T content is seen in the sub-telomeric
regions [10], whereas in P. falciparum the distribution is
almost even across all chromosome locations [11, 26].
Because the A+T content is so high in P. falciparum,
and considering the high virulence of the parasite,
it is possible that high A+T content provides
the basis for its remarkably high antigenic variation. This
is because the fixation probability of alleles rich in
A+T nucleotides is lower than that of G+C-rich alleles, and
genes with antigenic and cyto-adherence
properties maintain high allelic diversity without going
to fixation, being maintained by balancing selection [27].
A large number of N bases are distributed over all the
chromosomes, and their number increases with
chromosome length. We aim to analyze
the role of these N bases in the context of genomic studies.


The lengths and nucleotide compositions of the vir and var
genes are quite dissimilar. Their antigenic
variation is therefore also different, which may lead to
differences in invasiveness. In general, chromosome
length is affected by the frequency of occurrence of
tandem repeats. The overall number of tandem repeats is
higher in P. falciparum than in
P. vivax. In both species
chromosome length is strongly correlated with the number
of tandem repeats, and the correlation coefficients are
nearly equal, but the slope of the
line of best fit differs [Figure 1], which indicates that
the number of tandem repeats per unit length of chromosome
is higher in P. falciparum than in P. vivax. Because
the number of repeats is directly
proportional to the relative simplicity factor (RSF),
a high RSF value helps the organism to avoid the host's
immune response [28, 29].
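
The per-unit-length comparison can also be checked with simple arithmetic on the reported figures. The sketch below divides the total tandem repeat counts given in the Results by the summed chromosome sizes from Table 1. This genome-wide average is my own cruder summary, not the slope of the fitted line in Figure 1.

```python
# Chromosome sizes (bp) from Table 1 and total tandem repeat counts
# reported in the Results.
sizes_bp = {
    "P. vivax": [830022, 755035, 1011127, 876622, 1370936, 1033388, 1497819,
                 1678596, 1923364, 1419739, 2067354, 3004884, 2031768, 3120417],
    "P. falciparum": [643292, 947102, 1060087, 1204112, 1343552, 1418244, 1351552,
                      1325595, 1541723, 1694445, 2035250, 2271916, 2732359, 3291006],
}
total_repeats = {"P. vivax": 28817, "P. falciparum": 75952}

# Average number of tandem repeats per kilobase of chromosome sequence.
density_per_kb = {
    species: 1000.0 * total_repeats[species] / sum(sizes)
    for species, sizes in sizes_bp.items()
}
for species, density in density_per_kb.items():
    print(f"{species}: {density:.2f} tandem repeats per kb")
```

With these inputs the average density in P. falciparum is more than twice that in P. vivax, in line with the steeper best-fit line described above.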


Although the phylogenetic trees of the cytochrome b
and HMGB proteins differ, both indicate
that P. falciparum and P. vivax arose as separate
lineages from a common ancestor. HMGB is a very important
transcription factor required for the survival of these
species, so blocking transcription with
an appropriate ligand that binds HMGB should halt the
parasite's survival. This strategy could be helpful in drug
development. The level of conservation of HMGB
proteins between P. falciparum and P. vivax is very low,
as shown in the phylogenetic tree (Figure 3b), so a single
drug is unlikely to work against both. Designing
different drugs for P. falciparum and P. vivax may
therefore be a necessary strategy.


CONCLUSION 

This study found many low complexity regions
(LCRs) in Plasmodium falciparum. LCRs have
been shown to possess unfolded coil structures [30].
They are disordered regions composed mostly of
hydrophilic and low molecular weight amino acids,
which impart flexibility with respect to folding, so that
the resulting protein structure is adaptable to
ligand binding when required [31]. Natively
unfolded/disordered regions in proteins confer
a considerable functional advantage, as they allow
efficient interaction with several different regions [32].
In the context of possible drug targets, low complexity
regions, which were previously discarded, will be taken
into consideration and analyzed intensively in our next
study.